Wikipedia Missing Link Discovery: A Comparative Study
Authors
Abstract
In this paper, we describe our work on discovering missing links in Wikipedia articles. This task is important for both readers and authors of Wikipedia. Readers benefit from higher article quality and better navigation support, while authors can use the system as an aid during editing. This study combines the strengths of approaches previously applied to the task and adds its own techniques to reach satisfactory results. Because the task is inherently subjective, automatic evaluation is hard to apply. Comparing approaches seems to be the best way to evaluate new techniques, and we offer a semi-automated method for evaluating the results. Recall is calculated automatically using the existing links in Wikipedia. Precision is calculated from the manual judgments of human assessors. Comparative results for the different techniques are presented, showing the success of our improvements. We use the Turkish Wikipedia, which we are the first to study, to examine whether a small Wikipedia instance is scalable enough for such purposes.
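The semi-automated evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the data layout, and the example link targets are all assumptions made for illustration. Recall is computed against the links already present in an article, and precision against per-link accept/reject judgments from human assessors.

```python
def recall_from_existing_links(suggested, existing):
    """Fraction of the article's existing links that the system also suggested."""
    existing = set(existing)
    if not existing:
        return 0.0
    return len(existing & set(suggested)) / len(existing)


def precision_from_judgments(judgments):
    """Fraction of suggested links that human assessors accepted.

    `judgments` maps each suggested link target to True (accepted)
    or False (rejected). This layout is a hypothetical choice.
    """
    if not judgments:
        return 0.0
    accepted = sum(1 for ok in judgments.values() if ok)
    return accepted / len(judgments)


# Hypothetical data for one article (not taken from the paper).
existing_links = {"Ankara", "Istanbul", "Anatolia", "Bosphorus"}
suggested_links = ["Ankara", "Istanbul", "Black Sea", "Turkey (bird)"]
assessor_judgments = {
    "Ankara": True,
    "Istanbul": True,
    "Black Sea": True,       # missing link the assessors agreed with
    "Turkey (bird)": False,  # wrong sense of the anchor text
}

recall = recall_from_existing_links(suggested_links, existing_links)
precision = precision_from_judgments(assessor_judgments)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this sketch a suggestion such as "Black Sea" counts against neither measure unfairly: it is absent from the existing links, so it cannot raise recall, but the assessors can still credit it toward precision, which is the reason the abstract gives for involving human judgment.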
Similar resources
Experiments and Evaluation of Link Discovery in the Wikipedia
Collaborative knowledge management systems such as the Wikipedia are becoming ever more popular – and these systems typically contain hypertext links between documents. The Wikipedia offers both manual and automated link creation. In fact, several different systems providing links for Wikipedia documents now exist. Problematically, the quality of automatically generated links has never been quanti...
Using Explicit Semantic Analysis for Cross-Lingual Link Discovery
This paper explores how to automatically generate cross-language links between resources in large document collections. The paper presents new methods for Cross-Lingual Link Discovery (CLLD) based on Explicit Semantic Analysis (ESA). The methods are applicable to any multilingual document collection. In this report, we present their comparative study on the Wikipedia corpus and provide new insi...
Link Discovery in the Wikipedia
In this paper we describe our approaches taken in the Link-the-Wiki track. We submitted runs for all three Link-the-Wiki tasks: Link-the-Wiki, Link-Te-Ara, and Link-Te-Ara-to-the-Wiki. To generate outgoing links for each task, our link discovery system employs the top ranking algorithms from previous LTW tracks and a hybrid method derived from them. For incoming links, we used traditional infor...
Automated Cross-lingual Link Discovery in Wikipedia
At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on the “translation” using cross-l...
The Methodology of Manual Assessment in the Evaluation of Link Discovery
The link graph extracted from the Wikipedia has often been used as the ground truth for measuring the performance of automated link discovery systems. Extensive manual assessment experiments at INEX 2008 recently showed that this is unsound and that manual assessment is essential. This paper describes the methodology for link discovery evaluation which was developed for use in the INEX 2009 Li...
Publication date: 2010